ABSTRACT


Will there be a breakthrough in the field of information retrieval?

One  authority  in  that  field  has  said,  "No."  This  paper  adopts  the 
opposite  viewpoint,  and  speculates  on  what  the  elements  of  such  a 
breakthrough  might  be  if  it  were  to  occur. 

Several breakthroughs in other fields are scrutinized in order to highlight
the factors which characterize and energize sudden expansions of
new  technologies.  These  factors,  plus  some  factors  specific  to  the 
field  of  information  retrieval,  are  then  extrapolated  into  a  "plot  for 
a  breakthrough." 
HOW TO PLOT A BREAKTHROUGH


"Breakthrough" is a word with military-journalistic origins which was apparently
first used, in a large-scale way, to pertain to the breach of the German lines
in Normandy during the summer of 1944, and the subsequent race by Patton and
other elements of the Allied armies to Paris and points east. Since the war,
the word has been repeatedly used to refer to a phenomenon which has become
especially frequent in the last two decades: the "technological breakthrough."

In  analogy  to  the  military  breakthrough,  there  is  the  entrenched  enemy  of  a 
formidable  technical  problem,  the  breach  made  by  some  powerful  new  approach 
or  technique,  and  the  exploitation  spearheads,  which  may  fan  out  in  many 
unexpected  directions. 

As is usually the case with glamour-infused terms, vulgarization sets in,
so that anybody and everybody can benefit from the "psychological fall-out"
of the usage of the term. When so many people use a word like "breakthrough"
in connection with promotional efforts, the use of the word soon becomes
suspect, and those few who still employ the word in an honest way are often
themselves forced to find still another word to describe what they are talking
about.

Nevertheless, the word "breakthrough" is just as much here to stay as is the
phenomenon to which it refers. Furthermore, if we understand what a
breakthrough is, and why and how it takes place, we are in a much better position
to use the word meaningfully and to shape research and development efforts
in ways to increase the speed of incipient breakthroughs as they are recognized.
One can go even a step further and assert that, knowing all the major
characteristics and possibilities of a given problem area, one can actually plot a
breakthrough. It is admittedly a tricky thing to do, and failure is
overwhelmingly probable, but success is possible.

What are the defining attributes of a breakthrough? As we enumerate them
here, in each case we are going to look at some pattern of events in history
which displays the enumerated attribute along with other typical breakthrough
attributes.

The first attribute is the relative shortness of time¹ over which a
breakthrough materializes and grows to maturity. This can be illustrated by what
was perhaps the first genuine breakthrough in the history of man: the control
of fire. Man is said to be a tool-making animal, but his development in the
use of tools proceeded with painful slowness over tens of thousands of years;
the use of tools by humans could be called a breakthrough only in the
perspective of geological time.


1  This  attribute  is  descriptive,  whereas  those  to  follow  are 
or  causative. 
for electricity was latent, but perceived and understood by a large enough
number that the breakthrough was practically obliged to happen.

The latent need constitutes only half of the driving force behind a
breakthrough; the other half comes from the emergence of clearly superior new
methods, the third attribute of a breakthrough. Arguable superiority is not
enough; superiority must be incontrovertible. In the contest to which we
have just alluded, electric light versus gas light, the problems of
constructing untried and unfamiliar electric power distribution networks were so
great that such enterprises might not have been attempted except for the fact
that, once the electric light had been shown feasible, almost anyone could
see that the use of electricity was superior in almost every important respect
to the use of gas, even when a battery was the electric power source.

In modern times one of the best illustrations of this principle is found in
the transistor. In the ever-shorter time spans in which breakthroughs are
taking place, we find less than a decade has elapsed between the first public
announcement of the transistor's invention and the large-scale manufacture of
transistorized computers and other devices; and today, after only fifteen
years, transistors are the basis of a billion-dollar industry.

In part this rapid progress was energized by the requirement for
miniaturization and low power in airborne and satellite computers. But the transistor,
quite apart from the special needs of modern weapons systems, has such general
superiority over vacuum tubes that a breakthrough could have been expected in
any event, once the characteristics of the transistor were made known to the
technological community.

The vacuum tube is fragile, cumbersome, and wasteful of power; its fragileness
leads to unreliability and to sensitivity to shock and vibration; its
cumbersomeness requires complex instruments such as computers to be bulky and spread
out over entire floors, like the stacks in a public library; its high
consumption of power leads not only to unnecessary operating expense, but also creates
added engineering problems connected with heat removal, especially when much
electronic equipment is required to be in a confined space, as in an aircraft
or aboard ship. But the transistor has none of these disadvantages.

The transistor, then, is one of the more remarkable examples of
across-the-board superiority of a new technique over an old one that makes a breakthrough
"in the cards." How many other examples are found, in history, of rapid
technological expansion following widespread realization of superiority of
method? It would be difficult to cite them all. There are irrigation (3000 B.C.),
movable type (1450 A.D.), steam power (1770), railroads (1801), telephony (1876),
radioactive tracers (1934), amplification by stimulated emission (masers, lasers,
1951). These are only a few of the more spectacular examples.
A fourth attribute of a breakthrough is adequacy of available supporting
technology. Perhaps there was a need for automatic computational machinery
in 1823; perhaps Babbage's method of mechanical computation was clearly
superior to anything in sight; unfortunately, the supporting technology
required for the feasibility of Babbage's system did not yet exist.

There are other less dramatic examples of situations where the need and the
superior method were both available, but where the actual breakthrough was
postponed until the required supporting technology was developed. Though
Lee de Forest invented the three-element vacuum tube which made radio reception
feasible in 1906, radio broadcasting did not come to pass until 1921;
manufacturing methods were just not good enough prior to that time to assure
reasonable reliability of performance in an electronic device even as simple
as a radio.

The importance of "adequacy of available supporting technology" is not to be
underestimated merely because it is given here as the fourth attribute, rather
than as one of the first three. There have been times when either the reality
of the need or the superiority of method was not yet clear enough to be seen
by those who needed most to see it, but where supporting technology together
with a handful of imaginative entrepreneurs turned the tide toward breakthrough.

When intercontinental rocketry began in the early 1950's, the accent both in
the U. S. and Russia was on liquid-fueled systems; solid-fueled rockets were
thought of as nothing more than glorified artillery shells. The U. S., however,
had the good fortune to have a highly developed organic polymer technology,
which included synthetic rubber and plastics. Some of the organic polymer
industries saw that there was a future for them in solid propellants and began,
more or less on their own, to plot what would eventually become a breakthrough.

The major problem was to influence the people who held the pursestrings in
the Defense Department to finance the more expensive phases of development of
large solid-fueled boosters, and it was at this point that the "clear superiority
of method" slowly began to assert itself. Even here progress was difficult,
because those factions with vested interests in liquid-propellant system
development were most persuasive in their arguments to the effect that solid
fuels were inherently inferior. One problem was the seemingly inadequate
thrust of solid fuels. Another was the problem of cutoff control; stopping
a solid-fueled rocket when the proper velocity was achieved seemed about as
difficult as stopping an erupting volcano.

As  we  now  know,  problems  of  the  above  sort  were  not  insoluble,  but  required 
only  a  reasonably  steadfast  application  of  engineering  know-how  and  ingenuity. 

In Russia, with its somewhat underdeveloped petrochemical industry and its
trailing position in the development of compact nuclear warheads, the case
for solid-fueled intercontinental missiles was never strong enough, during
the 1950's, to justify the developmental investment.
Doubtlessly most of us remember the first newsreel shots of a Polaris missile
being fired from its submerged mobile launching pad. What we saw was slightly
incredible: a ridiculous "pop bottle" thrusting itself out of the sea,
drunkenly righting itself, and taking off like a Fourth of July rocket, in
great contrast to the agonizing slowness of the initial flight of an Atlas
or a Titan. No one could then deny that a breakthrough had achieved maturity.

As preparation for discussing "How to Plot a Breakthrough," we review the
attributes so far discussed, with an indication that each must be taken into
account if the plot is to succeed.

1. Relative Shortness of Time. This attribute was presented first because it
describes what our goal is: to bring to a state of widespread application a
new technology whose principles are understood at the outset by perhaps no more
than half a dozen people, in a period of time which is quite short in comparison
with typical technological gestation periods. Commercial atomic power is an
example of a nonbreakthrough, in the sense that, though the principles of the
technology were understood in 1942, essentially twenty years were required to achieve
the state of "widespread application." In today's times a self-respecting
breakthrough should take no longer than five years.

2. Latent Need. This factor was presented second because, though it is a
driving force for a breakthrough, it is a most difficult element to harness. If
one attempts to "plot a breakthrough" and fails, it is very likely because he
does not understand how to assess and capitalize on the need factor. A technical
person may have the educational qualifications to predict where there
is a breakthrough to be made, and even be right, but be fatally ignorant of the
quasi-political processes by which strategically placed people become convinced
that the need does exist and can be met through a new technology. Unless more
than a few are induced to develop and apply the new technology, the support
required in the early phases of breakthrough cannot be mustered. This decade
sees so many "competing breakthroughs" and so many high-pressure types telling
consumers, big and small, what they need, that a breakthrough without an adequate
public relations apparatus is headed for stagnation before it starts.

3. Clearly Superior Methods. We think of a man as fortunate indeed if he
stumbles onto a method or technique which looms head-and-shoulders in
effectiveness over prevailing arts. But it is not necessarily a case of luck; such
techniques can be deliberately sought, and found (this is why we finance what is known
as "research"). The reason so few are rewarded in the search is not so much that
most people are unlucky, but that most people do not persist long enough in the
search; they are content to invest their time and energy in the first marginally
superior idea they encounter. To put it another way, if one aims for Mars and
in the process reaches the moon, he becomes satisfied with the lesser goal because
he never dreamed that he would be one of the privileged few to stand on the moon.
The avenue, then, to finding clearly superior methods is: don't be satisfied with
the moon.
■ggiaggdC--  *fe  adsgrr  rto^x  to*  tcwer  of  toe-  togfral  cciEsnrtgg,;..  *«fcr  is  to
iac  a.  breagtorcsga.  ~3g  set  tcrirrsdf  a.Tr°agyf:  Tog-  hardisgrs:  fa  sdtogistz- 
'rn7g-  sstoagre  'jT-rurng  srstsssg  and.  Issgngges^.  fa  slsc  xfparigfg-  vi&ergliL. 

~  -iesr  tat  barrier!  It  fa  trcbably  to  fas  fiti  to  anytofrs  fa  toe  crcvtocs  of 
•,--a«Ertj  -;n»-  zEXToia  " — -sosr  as  gxc  be  ggZIgg  toe  'iftri-  sctoasre'*  teaasi 
saegxftogf Ty  fa  iel  ~~~  toed  aasraas  ftozcltotor  toe  retrienal  to'  netorsjL 


toast  x  aggacaiecsga. 


aas  agcaar  me 


-mgtigga:  xa .  Ibe  ■i_an 
set.  !amrtr  toe  smnggctog 


icfwssa  32 


25.,,  3CE.  a 


»£  ZSB..  cacgc  CTT- 


J.  ■?«»»■•■— rwf-  -xrm-  -*n~  —  2EV  ,-g--  Z3^ZSZ%  2mm.  <C3t5oig — ""ey  to  n«*-  rto  yr~  ayr  _ 

Ibere  are  a  -xtrfety  to  etosttoir  aey~  ay  -crfaa  tana  can.  be  <rfsais.3»=ai  on.  scenes 
amf:,.  say,  tcrrectad  arr  raaad  by  m  editor  totoorr.  ecer  assrEcsg  tc  gp.  fXteaagfe 
fine-  arcara— aaa reccftoc- toast  cycle  totoix  fa  sc  amsiersasE  ■wiSJt  toe  aaal  to-cart 
cedtotoues-  Ibese  -says  range  fm  fcsaasaeats  sadx  aa  ferrm.tr cnto * s.  WlMexLg 
fa  tsirfto.  -aitgrtog  teletype  resaages  car.  erg  to'scla-yed  to.  toe  eperator  Star  oar— 
rgetiaa.  refers  actoaH y  betor  seat  rcct  „  tc  mrs  adjnamrad  scapes  accspttog 
aerrcfl  torait- 


3.  tofrd  aesr  tsabmlcgy  tsads  to  saasEaacxn  time  already  exlsttog  fa  t fie  fafaltaftfa 
todiscryc  tc  sddflftE.  to  toe  -ssrtocs  ao^-esed  rectories  tofea.  digital  faff-  taeafcy  -ae 
f2se  toe  ttotoelaatofa  erfat  reacer,  fto  gctornrfal  Is  g rsac  because  ft  remaers 
toto  famto  ss-  assy  far  toe  cancniaar  as  toe  aegg.  sys  careers  ft  easy  far  tone 
brafa-  atoortonstoLy,  ^rfac  reacers  aae  art  jet  able  to  acceat  ary  and.  ala 
arfatec  cetertal  afssed  befrre  toea,  bacacaa  sfT  toe  bewficeafag  •saaSefty  cT  type 
fiats  aafta  a  test  is  araera  grtattosx  sad.  toere-f.tr*  becaiause  c£  toe  toLr2fcn£iLty 
to'  ieneicctoit  r  aeracter-  fater  gr.stet  fca  togto  as  gsrer gjllaed  as-  -tost.  cT  toe  brinir 
eje  acd.  bra  to'  - 

If we are now to have any real chance of finding a "clearly superior method," we
must know the latent need as well as the adequate available supporting
technologies. It is a great deal harder to assess needs in a field like information
retrieval than it is to inventory available technologies: the open professional
literature describes the latter in as much detail as we are ever likely to ask,
but the former are described in a way which could hardly be anything but
superficial. It is not because we are poor observers of these needs, but more


because  there  are  so  sary 
are  e- ft  ter  ertrecely  dffff 


t  e_ecet ta »  e'jes 


nrftf.ta.jL  ei* 


■afcleffi 


tc  cbeer.t  cr  acttally  toi  aerrsc le . 


You cannot rely on what people say they need; you cannot even draw definitive
conclusions from what they do need. How could we have predicted, for example,
the effects of telephones and automobiles on consumers, in advance of their
invention? Alexander Graham Bell was only the first to discover the telephone;
every user since then has "rediscovered the telephone" in his own
private world, by finding out how to integrate the telephone into his own
pattern of living. Yet, if one had asked a resident of a New England town in
to®  lS7G'"s  ,to®to®r  Be  a,  -fcalggfanng,,  "Hie  totM.  oao&Bblly  a®  sang- 

toxag  Iiker  "logic,,  Mf  steer.  Thereto  Mom  aniy  geas!®  tealMa"  a®  Stt  1st.  A  harijjr 
car  BarfLy  get  a  fs^rto  toe Je  t&mg  Mtocrefr.  sane  gacaa-fia^agfiMia  canes  aemroinfi.  to 
g®ss  toe  tone  jaa&errn."  ataxit  Ms  stok.  cmut-  15®  lasstfc  tMng  I  watt  to  Mali. 
irffeaal  hell.  MoMa"  rfgftfc  fir  toe  mfcafflig  ®"  Bees®  ffoeMa"-'" 


T to  aeu-C'iarlr  to  oeedwssseaBsinBrfc.  Mtea  toe  greatest  jpra&aMlftey  <aff  scnnceiaE  to 
rhe  c~m»  ’as&fdi.  aeagTOa  frnw  gwag^Tj^  to)  aartaaOLDy  dfsEsSBesr  toe  Tufcfilffjy  aaff  "Etotoi 
new  and  lii  farflxtoass-  Sa^-<toeerwBttSiani  cam  be  aff  greet  tojprartaace.,,  hamma® 
toe  toCKEtoirf tigr  off  nrrtr.firK-^  gfi ■fr.Mirrg’*;  to  ifwwru.grrwfgriittj  Itfttfiy.  fisc 

esgerfafly  tone  to  «x  area  MS®  iaffimnaMfoni  settef .enaR*  wfter*  peregil®  tosterartt 
efto  toe  "gf-teteni  record  as  fmgfgflgnkllgM  aatoi  store  toe  dtoaerwr  baas  muy  nmme 

tosoces  to  ’TTinrrf  tny  Hrff<c  asm.  ^rffHTrm»frnrnrrwCT«7rMi*gr  ffiwri^i!K  -frflinn  to)  lmnrnfitorr  ItHrnanp- 
■rf  ctottos- 

One of the first things a conscientious self-observer will become aware of is
that there is a relationship between the frequency of his searches and the
distance between his normal work location and the medium to be searched; in
effect there is an inverse proportion between frequency and distance.

TTVnp  tocrtest  liftff  ■cfrrrtn^  flgT  jlmC <SS3^Kttl2i9BL  JHWJPE^  (OPcrarSj  to  iPBnamej  egtoto  Bnto 
am r  'arod-  Iff  toe  todSsrae&tom  gegntoei.  is  mot  aasOy  apccaatole  to  toe  mfirnffl,, 

cjrr^  T<tr<rik«  far-  imr'  •WnnlWmrfflatt.Wown  •igr||pr<>  »Hlt3hffm  iwibbc  IaB®tt2a  Off  bff«  (rBnaflir- 

Ms  saocksbelff,  Ms  motcbociks,  Ms  eegge^gattierre  file,  Ms  stork  to  jproffesMifltoTl 
£<3fSCl JMJLXp  QC~3 Iff  be~toe  *r*&nrrrrm ..  1TBn>  ]pOe  off  -Urn  OBDe*S  'ineirffo1^  gjt 

to  effect  am  acadltory  mmeej  atfto  a  ISgrptoaki  access  Mae  to  csne  angmmto-  A 
gsecscm  i—j  vsek  toe  «thw>iii »  Smgf  qvn^L  ^nn»  -to  aoca  toam  a  ifi.xirjwm  ffongns 
begnaai  ■~irtnrfrfT*f  off  m«k  miiPfifT*  e- 

Most of these dozen or so forays will be to neighboring offices within a
radius of 50 feet or so of his desk, either to talk to colleagues or to consult
books or journals known to be in their possession. Perhaps, then, only on two
or three occasions during the working day is one required to venture beyond
his immediate neighborhood in search of information. As a conscientious
self-observer I notice myself visiting the company library not more than twice a
week. Other "distant" forays are typically: 1) to consult books which I keep
at home, 2) to write letters to individuals known to have specific
information, 3) to peruse special (non-library) collections of documents possessed
by more distantly located colleagues, in and out of the company, or 4) to
attend professional meetings.


In addition to revealing the inverse proportionality between frequency of
search and distance to search media, self-observation also reveals that one
gffitts  amc&i  acgpacta^gg  ffinmm  Urrawaimg;  wmygg  t’rrifTteegMEti’nrn  Is  ires  toe  fcum£~  ]I  toeere*
marift*-  aitoermiiHPE  afctangtrs  "to  cransteructt  icrrxi  TTgrtintlmfm  a.  csrd  fiiUffi 


®ff  tift®  asaEttfflrUir  cnf  1057  sSUr®;.,  touii  ftaardL  Mterally  ttosit  it  was  mat  wartH;  tfcff 
effffimrtt.  ®EpennedL.  amt  affEaxrttJtegEadljjr,,  I  •was  nrefint refining;  £m  ny 

Ww&fl  a  togttfosr  juntas:  of  itfe  ©mrttffimtts  aff  ajy  ceSEE ®e  I  c-cniTitS  axanstmart  T3y 

ftmfrfj-  gteam  esBSgr  g g gg hjt  mtrti  ana  fgssy  mzax^„  tliffi  rnffiTfram  -S-mi^r  hra<r  frhry 

«2sreHEttag«  caff  out  gspuarlng;  fftossy  ^en£s£aas  ©scrar  ImfeKiircg;  toBarMaffiDagy,,  car  njuSaar 
wtteto  ftv»«ffiiiTg;  ®r  tew  may  fteraStlngs  frfrnrmTlfa  a  item  toe  fSOteL. 


Htte  "teed  sEHulyMs'1'  wa  dawe  mate  s®  Iter  gqjja&s  to®  am  itetoaresstlng;  new 

guft Hffnm  ffatr  viirT^  ~i '■urtirrg^  ttftig-  asgffigagE  nBBm"s  aer rm&s&z  -ft®  f  TTiffnTTTMBttii rmm-  t®  fiirfjpnrnrg 
tUffi  'flumnttlitty  anrii  qpaHtoy  taff  "tote  ii mgi iatagaM irrrrwggagTrfe  aE&£*  ~n>mgtoito 

®ff  tote  iLmffiannfflttS :Tnn  <raEfcaunE2r,!S  jrncTiit-nffir,  aEfc  Mis;  TTHn^tc  prrTliiVry  «rrr^<nm|pTl fi tftiFg-, 

two things: 1) It permits a large number of bits of information about
information sources to come into the consumer's quick access auxiliary storage;
his lack of inhibition in consulting these sources would be second only to his
lack of inhibition in consulting his own mind. 2) Because of the high
frequency with which the consumer will consult the sources within arm's
length, he will form a "mental index" of these sources which in time will
greatly increase his efficiency in their use.


Here, then, is what the information consumer's greatest latent need
is: for arm's length access to highly organized, highly condensed material
which gives him maximal contact with that portion of the world's written record
pertinent to his work. Unless someone can suggest an alternative need hypothesis,
we will adopt this as a major premise in the plot to break through.

The next step is the quest for a clearly superior method. We have the
available technologies. We have a reasonable hypothesis of need. How can
they now be brought together to increase the mutual contact between the
information user and the world's written record?

We must now remind ourselves anew of the most likely reason why a plot for a
breakthrough will fail: the difficulty of harnessing the latent need as a
driving force for the breakthrough. Let us suppose, by way of illustration,
that one conceives a computer-produced index which is clearly superior to any
other indexes, manual or automatic, now in sight. How would such an innovator
proceed to convince the world of users of the clear superiority of the index?
The most direct way would be to circulate the index to a large number of
users; because if the superiority of the index is clear, it ought to be
especially clear to one who uses it.

Unfortunately, much developmental work and capital investment is required
before one can ever circulate such a product to a large enough number of
people to create a "self-sustaining breakthrough." The inexpensive alternative
is to convince strategically placed persons, who control dollars and man-hours,
that the index is clearly superior. At this level, however, superiority may
not be so clear as it would be for a user of the finished product. Because
these people have greater responsibility, they also have greater caution.
Moreover, they will be constrained to evaluate the index according to
intellectual criteria rather than according to personal experience in usage.
Furthermore, an undeveloped idea is never as attractive as a finished product,
and a man is especially suspect in claiming that his idea is "clearly superior."

This is a fact which has to be contended with: ideas which later prove to be
decisive too often are unrecognized and/or opposed in their infancy. But one
thing must be recalled from our very first sentence about latent need: that
such need is accompanied by a receptive climate where unusual welcome is
extended to new ideas. The difficulty is that, in modern times especially,
the hospitality of this climate is erased by a saturation with many ideas
which are not so good; in other words, the climate can be so receptive that
bad ideas are frequently purchased, causing receptivity to become weighted
down with skepticism and caution.

Receptivity, however, is still extant, and if we can locate its surviving
foci we can mobilize the psychological capital, and eventually the budgetary
capital, necessary to give the breakthrough momentum. The most receptive
possible ally is one who already has a technology packaged up and for sale.
Just as the polymer industry was looking for solid-fuel rocketry, so might
some present-day industry be looking for the clearly superior retrieval idea.
If they can be satisfied as to the clear superiority of it, they will join
the breakthrough plotters energetically in developing it and promoting it.

Now, assuming for the moment that there does exist some benefactor with a
"technology for sale" who will support any "clearly superior idea" which we
might see fit to advocate, and for the moment not worrying about who this
benefactor might be, we now feel free to uncover the clearly superior path
to retrieval of documented information.

What should we put within "arm's reach" of the information user at his desk?
How should we fill his "quick access auxiliary storage"? The answer which
suggests itself is: we should surround the user with highly organized and
condensed summarizations of the literature. They must be handy enough,
palatable enough, and rewarding enough that the user will consult them frequently,
because the more frequently he consults them the better his "mental index"
will become, thus in turn lowering his inhibition in consulting his "arm's
length store" and increasing his effectiveness as a retriever. Handy enough,
palatable enough, and rewarding enough: we take up each of these requirements
in turn, in the hope of establishing the specifications for the clearly superior
retrieval instrument.

Handy Enough. Part of the handiness is, as we've realized, arm's length
proximity. Books, pamphlets, or file cards are handy. Microfilms, and other
such ultra-condensed representations, allow us to place more bits of
information within arm's reach of the user, but at a double cost: 1) some increase
in use inhibition, and 2) much less feasibility of "mental indexing,"
requiring recourse to an external indexing scheme with all its time-consuming
fetters. Microfilm, of course, could serve as a somewhat larger and less
accessible office storage, and if a correspondence were to exist between
book or file-card (macro) storage elements and the more capacious elements of
the microfilm, then the "mental index" giving access to the macro-storage will
also serve in part as an index to the microfilm store. This is optional,
however, and it appears a reasonable stance that the "arm's length store" should
be composed of macro-elements.

Palatable Enough. The elements of the user's arm's length store should be
easy to interpret, and preferably even fun to interpret. A large part of
our success in having any retrieval system work is in persuading people to
use it frequently and voluntarily. If there is joy in using a system, then
little persuasion is necessary. It is a sad fact both for traditional reference
systems (bibliographies, abstract journals, indexes) and for computer-generated
products (auto-abstracts, permuted title indexes) that their use is a dismal
way to spend one's time. In a pre-computer society this might have been
inherently unavoidable. We must now ask: can computers help us to generate
reference systems which are a pleasure to use?

Rewarding Enough. Many otherwise uncriticized retrieval schemes have often
fallen flat because a user had great difficulty in obtaining the documents
to which the system referred him. Generally speaking, in any system we must
expect the document to be less accessible than its condensed representation.
However small the access time and the inhibition in the user's reference to
his arm's length store, the purpose of the whole idea is defeated if each
search trail leads simply to a reference to a document. This is not so merely
because the document might take an hour or a day to deliver, but even more
importantly because the user may not want to read a whole document in each
case. In short, our arm's length retrieval system must provide information
rewards of a brief and crisp character, so that the user is given some
information whether or not he orders the document. Thus, even if the user never
orders a document, he will still derive enough value from his arm's length
system to motivate him to use it frequently.

The  formulation  which  now  suggests  itself  is  to  have  computers  generate,  from 
the full texts of documents, condensed representations which can be combined
in an organized way and issued as books, pamphlets, or card sets available
on request (or through selective dissemination) to users for use at their
desks. 

There  are  many  aspects  of  such  a  system  which  we  do  not  have  space  here  to 
discuss. For example, there is purging and updating to be considered; we
have room only to opine that the user should be given the option as to
whether these functions are under his personal control (requiring added effort
on his part) or under some form of automatic or institutional control.

The only aspects of the "arm's length system" which we want to discuss at
length in this article are those germane to our attempt to consciously plot
a breakthrough. We have just decided what the focus of the breakthrough is
to be: a highly organized, easily accessible, information-rich auxiliary
storage within arm's reach of the information user at his desk. The next
questions are "What will the elements (books, pamphlets, card sets, etc.) of
this store contain, and how will they be produced?"

What we first must think about is organization. The arm's length system has
two basic levels of organization: gross organization (arrangement of the
elements) and detailed organization (arrangement of sub-elements within each
element). On the gross organization level, we have appreciated that the
desired low inhibition in use of the system can best be realized if the user
is allowed to arrange the elements willy-nilly and to rely on his own familiarity
with where things are in his office environment. If the user is a compulsive type
who likes to carefully and deliberately organize and index his environment,
this should be his prerogative. Our system, if it is to be truly popular,
must accommodate a variety of temperaments.

We  assume  that  most  users,  compulsive  or  otherwise,  will  unavoidably  rely 
heavily  on  their  "mental  index"  of  their  auxiliary  store;  this  leads  to  the 
requirement that there must not be too many elements in the store; it argues
that books are preferable to pamphlets, which in turn are preferable to cards.

On the other hand, books are bulky, relatively expensive, and not amenable
to updating and selective purging. A tentative "optimum" might be packets of
pamphlets, where pamphlets can be grouped according to the user's own conception
of how the universe should be organized. The physical means of grouping could
be  box  files,  binders,  vertical  partitions,  or  whatever  means  might  appeal  to 
a  user. 

As we go from the gross organization level to the level of detailed organization,
we also travel from a realm where it is easy for the user to arrange
things  to  suit  himself  to  a  realm  where  there  are  far  too  many  units  to  be 
arranged  by  the  user.  As  we  go  from  the  large  to  the  small,  the  "mental  index" 
principle  becomes  less  and  less  effective,  and  the  willingness  of  the  user  to 
organize  things  declines  rapidly.  And  yet  at  the  detailed  level  organization 
is  of  critical  importance. 

How  are  we  to  achieve  organization  at  this  level?  At  this  point  we  have 
recourse  to  one  of  the  most  mighty  of  our  available  new  technologies:  the 
software  technology  known  as  "digital  computer  programming."  By  a  variety 
of ways we can feed the text of entire articles into computer storage and,
by  programmed  means,  perform  a  practically  infinite  variety  of  operations 
on  the  stored  text. 

One  of  the  operations  we  can  perform  is  to  group  documents  according  to 
similarity of word content, which approximates to a reasonable degree grouping
by topical similarity. There are working programs now available which can
do  this,  although  the  art  of  "automatic  grouping"  is  just  beginning  and 
cannot  be  called  an  "available  technology"  in  itself.  Nevertheless,  it  is 
becoming  apparent  that  pushing  this  art  is  an  important  component  of  the 
breakthrough we are plotting. It is not, however, an indispensable component;
there  are  grounds  for  believing  that  if  the  software  available  for  "automatic 
grouping"  remains  in  its  present  primitive  state  it  could  still  serve  as  a 
factor  in  the  breakthrough. 

We  need  not  be  unduly  disturbed  if  similarity  of  word  content  is  not  in 
precise correspondence with topical similarity. For one thing, "topical
similarity" is hard to judge when we deal with articles in the same field.
There are too many arbitrary subjective criteria by which one might decide
that article A is more similar to article C than it is to article B or
vice versa. For this reason, matching of choices of words by the authors
of the articles is as good a basis as any for determining degree of similarity.
Furthermore, this method of determining similarity has the indisputable
advantage of lending itself to automaticity.
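
As a minimal sketch of what grouping by similarity of word content could look like (an illustration only, not any of the working programs mentioned above): each document is reduced to counts of its content words, two documents are compared by the cosine of their count vectors, and a document joins the existing group it resembles most, provided the average similarity reaches a threshold. The stopword list, the threshold of 0.3, and the names word_counts, cosine, and group_documents are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stopword list; a real system would need a fuller one.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "for", "on", "with", "by"}


def word_counts(text):
    """Count the content words (non-stopwords) of one document."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_documents(texts, threshold=0.3):
    """Greedy grouping: each document joins the group with the highest average
    similarity to its members, or starts a new group if none reaches threshold."""
    vectors = [word_counts(t) for t in texts]
    groups = []  # each group is a list of document indices
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for group in groups:
            sim = sum(cosine(vec, vectors[j]) for j in group) / len(group)
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append([i])
        else:
            best.append(i)
    return groups
```

A greedy single pass is chosen here only for brevity; the result depends on the order in which documents arrive, which is one reason such grouping is "approximately" rather than literally achievable.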

Now, numerous researchers in the field of information retrieval have studied
document  grouping  according  to  similarity  of  word  content,  but  no  breakthroughs 
have  occurred  in  consequence.  Why  not?  Conscientious  as  the  studies  might 
have  been,  the  importance  of  representing  the  groupings  in  a  way  that  is 
visible  to  and  interpretable  by  the  information  searcher  has  not  been  seen. 

All  systems  advocated  on  the  basis  of  these  studies  have  had  high  inhibition 
factors.  When  we  see  that  our  approach  to  retrieval  must  involve  low  inhibition 
coupled with enhanced motivation, we are on the road to a "clearly superior
method."

Our  first  tentative  decision  was  to  provide  the  information  user  with 
packets  of  pamphlets  for  the  user  to  arrange  as  he  sees  fit.  Our  second 
tentative  decision  is  that  the  organization  of  the  material  in  the  pamphlets 
should  in  some  way  be  based  on  word-content  similarity  as  determined  by 
appropriate computer programs. What this means in particular is that all
of the documents represented on, say, page 108 of a given pamphlet should
bear a closer resemblance in word content to each other than they do to documents
represented on other pages. (Such a state, by the way, does not have
to be literally achievable; similarity relations do not necessarily structure
themselves in such an obliging way. All that matters is that it be "approximately
achievable.")

We are now ready for a more detailed description of a typical pamphlet:
what all is in it and what does it look like? Like any reasonable pamphlet,
this one will have a "table of contents" at the beginning; the table will
be in large part computer-generated; but it will also be heavily edited.

Without  the  latter  operation,  the  breakthrough  cannot  get  off  the  ground, 
in this decade at least. We must not forget that "palatability" is an
important factor in enhancing the motivation of the user to frequently
consult his auxiliary storage. Unedited computer output is likely to be
unpalatable  for  a  long  time  to  come. 

After  the  table  of  contents  we  come  to  the  basic  material  of  the  pamphlet. 
Each page will have a dozen or so representations of topically close documents.
What will they be like? Will the page look like an abstract journal?

If it is literally like an abstract journal, we can bet that the breakthrough
will not go; I know professional people who have in their office
cartons full of abstract journals which they never look at.

Part of the trouble is the "underorganization" of these journals; but also
it is, I think, that there is something repulsive about a page full of
abstracts. We must construct a page which will draw the user in. In the
center  of  the  page  (or  at  the  top)  will  be  a  large  word  or  term,  in  bold¬ 
face  type  £_inch  high.  It  will  be  a  word  or  term  suggestive  of  what  the 
documents  represented  on  that  page  have  in  common;  it  will  very  likely  be 
a  content  word  (or  term)  which  all  of  the  documents  have  in  common  as  a 
key word, i.e., a word of high frequency. Radiating outward from the
centrally  placed  word,  thin  lines  will  lead  to  words  in  smaller  type  which 
reflect  what  certain  sub-groups  of  the  documents  have  in  common.  Finally, 
for  each  document  will  be  shown  a  single  sentence— a  conclusion,  fact,  or 
opinion. 
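
The choice of terms for such a page can be sketched in a few lines. This is a hypothetical illustration rather than a specification: the headline is taken to be the most frequent content word that occurs in every document on the page, and the radiating terms are words shared by at least two of the documents but not by all of them. The stopword list, the limit of four sub-group terms, and the names content_words and page_terms are assumptions introduced for the example.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "for", "on", "with", "by"}


def content_words(text):
    """Count the content words (non-stopwords) of one document."""
    return Counter(w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS)


def page_terms(docs, max_subterms=4):
    """Return (headline, subterms) for the documents represented on one page.

    headline: the content word found in every document with the highest total
              frequency (None if the documents share no content word).
    subterms: maps each sub-group word to the indices of the documents using it.
    """
    if not docs:
        return None, {}
    counts = [content_words(d) for d in docs]
    total = Counter()
    for c in counts:
        total.update(c)
    common = set(counts[0]).intersection(*counts[1:])
    headline = max(common, key=lambda w: (total[w], w)) if common else None
    # Number of documents in which each word appears.
    doc_freq = Counter(w for c in counts for w in c)
    shared = {
        w: [i for i, c in enumerate(counts) if w in c]
        for w, n in doc_freq.items()
        if 2 <= n < len(docs)
    }
    ranked = sorted(shared, key=lambda w: (-len(shared[w]), -total[w], w))[:max_subterms]
    return headline, {w: shared[w] for w in ranked}
```

In keeping with the emphasis on palatability, the output of such a routine would be a suggestion for the editor rather than a finished page.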

Would  this  sentence  be  picked  out  of  the  document  automatically?  Largely 
yes.  The  actual  process  would  probably  be  something  like  this: 

1)  A  computer  program  selects  six  or  eight  sentences  from  document 
text,  based  primarily  on  richness  of  word  content.  Ways  of  doing 
this  have  already  been  worked  out  (2,3). 

2)  An  editor  selects  one  of  these  and  rewrites  it  so  that  it  can  stand 
by  itself,  without  requiring  the  context  of  the  document  for  its 
easy  interpretation.  In  an  advanced  index-generating  system  this 
could  be  accomplished  at  a  display  scope  with  a  light  pencil. 

3)  The  sentence,  once  designated  by  the  editor,  is  assigned  by  the 
computer  program  to  that  page  of  the  index  which  contains  its 
closest  topical  neighbors. 
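
Steps 1 and 3 of this process can be sketched as follows; step 2 remains the editor's job. The sketch makes several assumptions that are not taken from references (2,3): "richness of word content" is approximated by summing, over the distinct content words of a sentence, how often each occurs in the whole document, and "closest topical neighbors" by cosine similarity between the sentence and the pooled words of the sentences already on a page. The function names, the stopword list, and the sentence-splitting rule are hypothetical.

```python
import math
import re
from collections import Counter

# Assumed, deliberately tiny stopword list.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "for", "on", "with",
             "by", "from", "it", "its", "were", "was", "not"}


def content_words(text):
    """List the content words (non-stopwords) of a text, in order."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def candidate_sentences(document, k=6):
    """Step 1: return the k sentences richest in the document's frequent words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    freq = Counter(content_words(document))

    def richness(sentence):
        # Each distinct content word contributes its document-wide frequency.
        return sum(freq[w] for w in set(content_words(sentence)))

    return sorted(sentences, key=richness, reverse=True)[:k]


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def assign_page(sentence, pages):
    """Step 3: pick the page whose existing sentences are closest in word content.

    pages maps a page number to the list of sentences already placed on it.
    """
    vec = Counter(content_words(sentence))

    def similarity(page):
        profile = Counter(w for s in pages[page] for w in content_words(s))
        return cosine(vec, profile)

    return max(pages, key=similarity)
```

In use, the editor would be shown the output of candidate_sentences, would choose and rewrite one sentence, and the rewritten sentence would then be passed to assign_page.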

There  are,  to  be  sure,  dubious  aspects  of  this  scheme.  The  authors  of  the 
documents  may  be  upset  at  the  thought  of  having  a  single  sentence  of  theirs, 
out of context, displayed to thousands of information users of whom only 1%
might  ever  order  the  entire  document.  I  am  sure,  however,  that  these  same 
authors  would  feel  no  pain  upon  reading  similar  sentences  drawn  from  the 
text of other authors. If there is uncertainty about the truth or significance
of the sentence, one can always order the article. The fact is, in
most cases it will not matter. The world is already constructed in such a
way that the user's picture of 99% of the documents to which he is exposed
is fragmentary indeed. Furthermore, as a system like that just described
undergoes evolution, the authors may themselves be allowed to designate
or  specially  compose  the  sentences  which  will  represent  their  articles. 

Another dubious aspect of this scheme is that it sounds like a tremendous
amount of human labor would have to be put into the index-generating process
if an editing operation is required for every entry in the pamphlet. On
the other hand, the editor pictured in our scheme has to do far less work per
document than a conventional abstract writer would have to do. If, through
computer use, we make a 90% saving in editorial effort, it is almost as
large as the 100% saving we would make under complete automation of index
generation. The question we have to ask is: what does the residual 10%
of editing effort accomplish?

In  this  case  it  accomplishes  a  great  deal,  because  without  it  we  would 
not have a breakthrough. The requirement for "palatability," as we have
already noted, is crucial enough to justify a considerable investment in
editing. The investment in editing is actually but a small added cost
which  we  regard  as  an  insurance  premium— that  the  millions  of  computer 
program  operations  required  to  analyze,  compare,  group,  organize,  and 
condense  documents  will  not  have  been  expended  in  vain. 

What else is in the pamphlet? In addition to the "palatable" portion of the
pamphlet, to which the user will refer most often, there will be a "business"
portion, which need not be palatable and can be automatically generated
without a requirement for editing. The "business" portion will contain two
sections. One will be a standard bibliography containing all of the information
the user would need in order to find the document itself. The other
section would list pamphlets containing more detail per article, in case
the user elects to have a more detailed index of an area of special interest
to him. Finally, there is an index, generated automatically with perhaps some
editing, which will provide alphabetical entry to the pamphlet, thereby
supplementing entry on the basis of topical similarity.

Now we take a second look at the "arm's length system," in quantitative
and qualitative terms. Quantitatively, we see perhaps 200 pamphlets within
easy  reach  of  a  user,  each  of  which  contains  references  to  perhaps  a  thousand 
documents:  200,000  "palatable"  references  in  a  highly  organized  state. 

Such  a  system  could  well  accommodate  references  to  all  of  the  articles 
in  one's  own  profession  for  the  past  decade,  plus  a  good  number  of  references 
in related fields of the user's choice.

Qualitatively,  what  do  we  have?  What  it  amounts  to  is  a  highly  structured 
collection  of  proverbs.  Each  such  proverb  (which  need  not  necessarily  be 
factual)  is  the  crisp  "information  reward"  which  we  decided  earlier  would 
be required to enhance user motivation. There is then not only low inhibition
in using the file, but also a rather strong temptation to browse. Many amusing
surprises would lurk in the file, such as "The Zwirtz Process requires phosphate
rock feed with low silica contamination" next to "The advantage of the Zwirtz
Process is its tendency to precipitate and discard silica impurities." Or
we might see, "It is unthinkable that we can have automatic indexing without
first developing machine searching" adjacent to "From a research viewpoint,
a study of automatic indexing is an essential prelude to the solution of the
machine searching problem."

To reiterate: when searching the literature becomes convenient and entertaining,
people will search the literature. Indeed, five years from now it
could be commonplace to hear supervisors saying, "Jones, will you please
stop searching the literature and do some work."

The question we finally ask is: Who will provide the money and the man-hours
to start this breakthrough on its way? Who is the benefactor with the
"technology for sale" who will support a "clearly superior idea"? It is
highly likely that such a party will be found within the publishing industry.
The motivation of this party would be pure and simple: Is this "clearly
superior idea" a good way for me to make money?

The  possibilities  for  market  expansion  are  large.  Consider  the  following 
quotation of John Markus (4) of the McGraw-Hill Book Company:

"...Money  is  at  the  root  of  most  index-publishing  problems.  Individuals 
rarely  buy  published  indexes,  partly  because  of  their  high  and  con¬ 
tinuing  cost  and  partly  because  they  usually  have  to  go  to  a  library 
anyway  to  get  the  documents  cited  in  an  index.  The  market  for  index 
volumes is thus small because it consists largely of libraries...."

When we talk about hundreds of pamphlets lining the offices of tens of
thousands of professional workers, we are talking about a huge market which
does  not  now  exist.  The  major  question  of  interest  to  the  publisher  is: 

What will it cost to produce these indexes and will people buy them?

Questions  such  as  "Is  similarity  of  word  content  a  sound  basis  on  which  to 
organize  references?"  would  tend  to  be  of  little  interest  to  the  publisher; 
it  would  be  of  much  more  interest  to  him  that  automatic  organization  of  any 
kind  is  feasible,  and  here  again  the  question  would  be  "How  much  will  it 
cost?" 

Many  publishers  will  find  themselves  already  having,  as  part  of  their 
normal  operation,  most  of  the  text  they  publish  in  digitalized  form.  This 
puts  them  in  an  excellent  position  to  develop  a  pilot-plant  operation 
among  their  own  clientele.  Expenses  for  computer  time,  which  in  the  scheme 
we describe are still quite large, might be willingly borne, as part of a
development effort, if cost projections for future computer expense rates
decline steeply enough. And this is not the only possible source of cost
lowering. Another very substantial decline could come from finding programmed
methods vastly more efficient than any we now visualize; but one couldn't
hope to realize such gains without being willing to undertake a development
effort.


A publisher, furthermore, would tend to be uninterested in such questions
as: "Is machine searching of text superior to machine indexing?" If he can
sell pamphlets to someone who works twenty feet from the output end of a
literature-searching computer, that's good enough for him. He would not be
interested in the outcome of rigorous evaluation experiments which determine
whether Index A is better or worse than Index B. But if Index B sells faster
than Index A, he is then very much convinced that Index B is better. The
publisher is not alarmed when he reads that the number of literature-producing-
and-consuming scientists is increasing exponentially. Indeed, he is quite
pleased, because it implies for him an exponential increase in business volume.


The  publisher  would  not  take  part  in  earnest  debates  about  whether  or  not 
we should have mammoth centralized information centers, as the Russians have.
He would instead take comfort in the realization that his customers would
be on the average several hundred miles from any such conceivable information
center.  When  an  information  center  is  finally  opened,  it  could  only  mean 
to  the  publisher  a  more  comprehensive  supply  of  raw  material  from  which 
to  generate  pamphlets. 

If publishers do indeed decide at this time that there is an "information
retrieval market," from which money is to be made, there are some interesting
evolutionary possibilities. One such possibility, for example, is that the
balance between man and machine in the production of indexes may shift back
toward the man, rather than toward the machine. In a doctrinaire climate
where people are satisfied with nothing less than "fully automatic high-quality"
processing of language, such a trend could not occur. But it
could occur if a publisher found that increased application of editorial
skill resulted in increased sales.


The  question  of  "How  to  Plot  a  Breakthrough"  cannot  be  answered  in  the  text 
of  this  paper.  It  is  much  easier  to  write  about  plotting  a  breakthrough 
than it is to plot one and have it thereupon take place. In one sense it
is about as foolish as writing about "How to Make a Million Dollars" without
having  first  made  it.  Indeed,  even  one  who  actually  makes  a  million  dollars 
is  not  necessarily  able  to  give  the  public  the  straight  goods  on  how  to 
make  a  million.  If  such  a  millionaire  were  in  full  possession  of  the  truth 
about  human  nature,  and  were  honest  in  revealing  his  knowledge,  he  might 
say:  "Don't  even  try  to  make  a  million.  There  are  only  a  few  people  who 
can  do  it,  starting  from  scratch,  and  I'm  one  of  them." 

Therefore,  even  if  I  had  already  successfully  plotted  a  breakthrough  I  would 
not be by that token in a position to advise others on the matter. On the
other hand, the theme has been emphasized herein that if breakthrough
plotting is at all possible, it can only be accomplished by studying former
breakthroughs and understanding the forces behind them. If one then looks
for equivalent forces in a current problem area, he stands a good chance
of finding them.

Spotting these forces, of course, does not automatically lead to a successful
forecast. One can leave an unrecognized important element out of the
calculations, an element which leads history in a completely different
direction than one might predict. Maybe there is some unrecognized virtue,
for example, in not keeping up with the literature. I have found that many
professional people are intimidated as to their productivity by their over-
awareness of how much the rest of the world knows. Will more effective
contact with the literature only intimidate them more? Do they intuitively
know this, and do they therefore try to avoid the literature? This paper
has tried to include the probable psychological characteristics of the
information user in its analysis, but it may not have gone nearly far
enough.

Whatever the case, it is a safe statement that practitioners of research
[illegible] in the field of information retrieval have steadfastly
refused to acknowledge that the [illegible] psychology of the information
user forms the central element in the whole retrieval picture. Some grudging
acknowledgments of this have been made, leading to library use studies and
retrieval system evaluation projects. These studies, of course, shed very
little light on what people will do when given radically new tools, such as
modern technology is making available. If I may be so bold as to say it,
perhaps our studies of the user are not sufficiently "fundamental." (One
is hard put these days to win support for such research because it is "too
abstract" and "[illegible]"; it is "too remote from practical application.")
And for some strange reason the horizon of "practical application" continues
to be bleak. The "fundamentals" of human nature, however, will catch up
with us whether we know them or not.


This could be a large part of the study of why breakthroughs happen. Do
they happen suddenly because they are long overdue? The principle of lasers
was described by F. G. Houtermans, a physicist, 30 years ago. Why was
there not a gradual development, rather than a sudden breakthrough? Solid-
fueled rockets are as old as gunpowder. Why didn't the Germans (who once
led the world in the technology of organic chemistry) develop Polaris-like
missiles? Electricity and its properties were known about in the times of
Napoleon. Why did it take a century to reach the "age of electricity"?

The balloon of dogma and mental rigidity is inexorably expanded by the
pressure of events. The tension increases. Then along comes one guy with
a pin. The result: poof! Another breakthrough has occurred.